Equalizers

ABSTRACT

The linear equalizer is implemented using second-order statistics under the assumption of two normally distributed classes, which significantly improves convergence speed and performance. The idea and teaching of the present invention are also extended to the case where equalization is viewed as a multi-class classification problem. Furthermore, the idea and teaching of the present invention are extended to blind equalization.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to equalizers that estimate the original signal in the presence of noise, delay and interference.

2. Description of the Related Art

If signal x(n) is transmitted through a linear dispersive channel, the received signal y(n) can be modeled by

$\begin{matrix} {{y(n)} = {{\sum\limits_{k = {- L}}^{L}{a_{k}{x\left( {n - k} \right)}}} + {e(n)}}} & (1) \end{matrix}$ where e(n) is the additive white Gaussian noise which might be modeled by the following statistics: E[e(n)]=0, E[e(n)e(m)]=σ_(e) ²δ(n−m).

It is further assumed that signal x(n) is a binary signal (1, −1) and an equi-probable and independent sequence with the following statistics: E[x(n)]=0, E[x(n)x(m)]=σ_(x) ²δ(n−m).
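The channel model of equation (1) can be sketched numerically as follows. This is a minimal illustration, assuming NumPy; the function name `simulate_channel`, the tap values, the noise level and the zero-padding of symbols outside the sequence are assumptions made for the example and are not part of the specification.

```python
import numpy as np

def simulate_channel(x, taps, sigma_e, rng):
    """Return y(n) = sum_k a_k x(n-k) + e(n) for a two-sided channel.

    taps holds [a_{-L}, ..., a_0, ..., a_L]; symbols outside x are taken as 0
    (an assumption made only for this sketch).
    """
    L = (len(taps) - 1) // 2
    # np.convolve(x, taps)[m] = sum_i taps[i] * x[m - i] with taps[i] = a_{i-L},
    # so y(n) corresponds to entry n + L of the full convolution.
    full = np.convolve(x, taps)
    y_clean = full[L:L + len(x)]
    return y_clean + sigma_e * rng.standard_normal(len(x))

rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=1000)          # equi-probable binary symbols
taps = np.array([0.2, 0.5, 1.0, 0.5, 0.2])      # illustrative symmetric taps, L = 2
y = simulate_channel(x, taps, sigma_e=0.1, rng=rng)
```

Zero-padding at the edges only affects the first and last L received samples, which is consistent with discarding incomplete windows later on.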

As can be seen in equation (1), 2L+1 signals {y(n−L), y(n−L+1), . . . , y(n), . . . , y(n+L−1), y(n+L)} contain some information on x(n) and can be used to estimate x(n). These 2L+1 signals can be represented as a vector as follows: Y(n)=[y(n−L),y(n−L+1), . . . ,y(n), . . . ,y(n+L−1),y(n+L)]^(T).

Furthermore, {x(n−2L), x(n−2L+1), . . . , x(n−1), x(n)} will have some effects on y(n−L) and {x(n), x(n+1), . . . , x(n+2L−1), x(n+2L)} on y(n+L). Thus, it can be said that {x(n−2L), x(n−2L+1), . . . , x(n), . . . , x(n+2L−1), x(n+2L)} affect the estimation of x(n) at the receiver. Accordingly, input vector X(n) and noise vector N(n) are defined as follows: X(n)=[x(n−2L),x(n−2L+1), . . . ,x(n), . . . ,x(n+2L−1),x(n+2L)]^(T), N(n)=[e(n−L),e(n−L+1), . . . ,e(n), . . . ,e(n+L−1),e(n+L)]^(T).

It is noted that the dimension of the input vector, X(n), is 4L+1. This analysis can be easily extended to non-symmetric channels.

Equalization has been important in communications and data storage, and numerous algorithms have been proposed. Among various equalization methods, linear equalization has been widely used due to its speed and simplicity. The linear equalizer is frequently implemented using the LMS algorithm as follows: z[n]=W^(T)(n)Y(n) where Y(n)=[y(n−L),y(n−L+1), . . . , y(n), . . . , y(n+L−1),y(n+L)]^(T), z[n] is an output of the equalizer, and W(n)=[w_(−L), w_(−L+1), . . . , w_(0), . . . , w_(L−1), w_(L)]^(T) is a weight vector. The weight vector is updated as follows: W(n+1)=W(n)+cλY(n) where λ is the learning rate, and c is 1 if signal 1 is transmitted and −1 if signal −1 is transmitted.
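The weight update quoted above can be sketched as a short loop. This is only an illustration of the recursion W(n+1)=W(n)+cλY(n) exactly as written, assuming NumPy; the names `train_weights` and `equalizer_output` and the toy numbers are hypothetical. The update as quoted contains no error term, whereas the commonly taught LMS recursion scales the step by the difference between the desired symbol and z[n]; that variant is not shown here.

```python
import numpy as np

def train_weights(windows, symbols, lam=0.01):
    """Apply W(n+1) = W(n) + c * lam * Y(n) once per training window."""
    W = np.zeros(windows.shape[1])
    for Y, c in zip(windows, symbols):
        W = W + c * lam * Y
    return W

def equalizer_output(W, Y):
    """Equalizer output z[n] = W^T Y(n)."""
    return float(W @ Y)

# Toy example with two 2-dimensional windows (hypothetical values).
windows = np.array([[1.0, 2.0], [0.5, -1.0]])
symbols = [1, -1]
W = train_weights(windows, symbols, lam=0.1)
z = equalizer_output(W, windows[0])
```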

In the present invention, the linear equalizer is implemented utilizing second-order statistics considering equalization as a classification problem. As a result, the convergence speed and the performance are significantly improved.

SUMMARY OF THE INVENTION

Equalization is an important topic in communications and data storage. In the present invention, equalization is viewed as a two-class classification problem and the linear equalizer is implemented utilizing second order statistics. As a result, the convergence speed and the performance are significantly improved.

Furthermore, the idea and teaching of the present invention is extended when equalization is viewed as a multi-class classification problem.

Still furthermore, the idea and teaching of the present invention is extended to blind equalization.

Thus, it is an object of the present invention to provide linear equalization methods that provide a fast processing time and improved performance.

The other objects, features and advantages of the present invention will be apparent from the following detailed description.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 shows the distribution of Y₁ and Y₋₁ in the y_(n)−y_(n−1) space.

FIG. 2 shows the distribution of Y₁ and Y₋₁ in the y_(n+1)−y_(n−1) space.

FIG. 3 illustrates how Y₁(i) and Y₋₁(i) are constructed.

FIG. 4 a illustrates how the equalizer classifies three samples at one time by treating three received samples as one state.

FIG. 4 b illustrates how the performance of the equalizer can be enhanced by classifying one sample at a time.

DESCRIPTION OF THE ILLUSTRATED EMBODIMENTS

Embodiment 1

If signal x(n) is transmitted through a linear dispersive channel, the received signal y(n) can be modeled by

$\begin{matrix} {{y(n)} = {{\sum\limits_{k = {- L}}^{L}{a_{k}{x\left( {n - k} \right)}}} + {{e(n)}.}}} & (2) \end{matrix}$

Although there can be a delay due to the distance between the transmitter and the receiver, such a delay may be assumed to be zero without loss of generality since the delay just represents a time shift at the receiver. For an easy illustration, it is assumed that L=2. It is further assumed that the channel is symmetric (a_(k)=a_(−k)) in equation (2). However, it is noted that the idea and teaching of the present invention can be applied to any value of L and to any non-symmetric channel modeled by

$\begin{matrix} {{y(n)} = {{\sum\limits_{k = {- L_{1}}}^{L_{2}}{a_{k}{x\left( {n - k} \right)}}} + {{e(n)}.}}} & (3) \end{matrix}$

If L=2, {y(n−2), y(n−1), y(n), y(n+1), y(n+2)} will be affected by x(n). In other words, {y(n−2), y(n−1), y(n), y(n+1), y(n+2)} contain information on x(n) and can be used to estimate x(n). Furthermore, as can be seen in equation (2), y(n+2) and y(n−2) are obtained as follows:

$\begin{aligned} y(n+2) &= \sum\limits_{k=-L}^{L} a_{k}x(n+2-k) + e(n+2) \\ &= a_{2}x(n) + a_{1}x(n+1) + a_{0}x(n+2) + a_{1}x(n+3) + a_{2}x(n+4) + e(n+2), \\ y(n-2) &= \sum\limits_{k=-L}^{L} a_{k}x(n-2-k) + e(n-2) \\ &= a_{2}x(n-4) + a_{1}x(n-3) + a_{0}x(n-2) + a_{1}x(n-1) + a_{2}x(n) + e(n-2). \end{aligned}$

Therefore, in order to estimate x(n), {y(n−2), y(n−1), y(n), y(n+1), y(n+2)} should be used and {x(n−4), x(n−3), x(n−2), x(n−1), x(n), x(n+1), x(n+2), x(n+3), x(n+4)} have effects on {y(n−2), y(n−1), y(n), y(n+1), y(n+2)}. As a result, it can be observed that {x(n−4), x(n−3), x(n−2), x(n−1), x(n), x(n+1), x(n+2), x(n+3), x(n+4)} affect the estimation of x(n) at the receiver.

Furthermore, {y(n−2), y(n−1), y(n), y(n+1), y(n+2)} can be computed in matrix form as follows:

$\begin{matrix} \begin{matrix} {\begin{bmatrix} {y\left( {n - 2} \right)} \\ {y\left( {n - 1} \right)} \\ {y(n)} \\ {y\left( {n + 1} \right)} \\ {y\left( {n + 2} \right)} \end{bmatrix} = {{\left\lbrack \begin{matrix} a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 \\ 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 \\ 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 \\ 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 \\ 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} \end{matrix} \right\rbrack\begin{bmatrix} {x\left( {n - 4} \right)} \\ {x\left( {n - 3} \right)} \\ {x\left( {n - 2} \right)} \\ {x\left( {n - 1} \right)} \\ {x(n)} \\ {x\left( {n + 1} \right)} \\ {x\left( {n + 2} \right)} \\ {x\left( {n + 3} \right)} \\ {x\left( {n + 4} \right)} \end{bmatrix}} +}} \\ {\begin{bmatrix} {e\left( {n - 2} \right)} \\ {e\left( {n - 1} \right)} \\ {e(n)} \\ {e\left( {n + 1} \right)} \\ {e\left( {n + 2} \right)} \end{bmatrix}.} \end{matrix} & (4) \end{matrix}$

When L=2, input vector X(n), received vector Y(n), and noise vector N(n) at time n are defined as follows: Y(n)=[y(n−2),y(n−1),y(n),y(n+1),y(n+2)]^(T), X(n)=[x(n−4),x(n−3),x(n−2),x(n−1),x(n),x(n+1),x(n+2),x(n+3),x(n+4)]^(T), N(n)=[e(n−2),e(n−1),e(n),e(n+1),e(n+2)]^(T).

In addition, matrix A is defined as follows:

$A = {\left\lbrack \begin{matrix} a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 \\ 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 \\ 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 \\ 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 \\ 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} \end{matrix} \right\rbrack.}$

Then, equation (4) can be rewritten as Y(n)=AX(n)+N(n).

In general, the size of matrix A will be (2L+1)×(4L+1), input vector X(n) (4L+1)×1 and received vector Y(n) (2L+1)×1. Furthermore, the input vector X(n), the received vector Y(n) and the noise vector N(n) can be viewed as random vectors, assuming a stationary process.
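The banded structure of matrix A for a general L can be sketched as below. This is a minimal illustration assuming NumPy; the helper name `channel_matrix` and the example tap values are assumptions, and the taps are passed in the order [a_(−L), . . . , a_(L)].

```python
import numpy as np

def channel_matrix(taps):
    """Build the (2L+1) x (4L+1) matrix A from taps [a_{-L}, ..., a_L]."""
    taps = np.asarray(taps, dtype=float)
    L = (len(taps) - 1) // 2
    A = np.zeros((2 * L + 1, 4 * L + 1))
    for i in range(2 * L + 1):
        # Row i produces y(n-L+i); column j multiplies x(n-2L+j) by a_k with
        # k = L + i - j, so the row holds a_L, ..., a_{-L} starting at column i.
        A[i, i:i + 2 * L + 1] = taps[::-1]
    return A

A = channel_matrix([0.2, 0.5, 1.0, 0.5, 0.2])   # illustrative values, L = 2
print(A.shape)
```

For a symmetric channel the reversal has no effect, while for a non-symmetric channel it keeps each row consistent with equation (2).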

According to the idea and teaching of the present invention, equalization is viewed as a classification problem where the equalizer classifies the received vector Y(n) as one of the binary states (1, −1). If state 1 is transmitted at time n, the input vector is given by X₁: X(n)=[x(n−4),x(n−3),x(n−2),x(n−1),1,x(n+1),x(n+2),x(n+3),x(n+4)]^(T) where X₁ represents a random vector and the subscript of random vector X₁ indicates that state 1 is transmitted (class ω₁). It is noted that X₁ is to be understood as a random vector and X(n) represents a sample vector of the input vector at time n. Furthermore, it is noted that the idea and teaching of the present invention can be easily applied even when there are more than 2 states. The mean vector and covariance matrix of X₁ are given by μ_(X₁)=E{X₁}=[0,0,0,0,1,0,0,0,0]^(T)

$\begin{aligned} \Sigma_{X_{1}} &= E\left\{ \left( X_{1} - \mu_{X_{1}} \right)\left( X_{1} - \mu_{X_{1}} \right)^{T} \right\} \\ &= \mathrm{Diag}\left( 1,1,1,1,0,1,1,1,1 \right) \\ &= \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \\ &= I - \mu_{X_{1}}\mu_{X_{1}}^{T} \end{aligned}$ where I is an identity matrix, since it is assumed that signal x(n) is a binary signal (1,−1) and that x(n) is an equi-probable and independent sequence, so that σ_(x) ²=1. The fifth diagonal element is 0 because the fifth element of X₁ is fixed at 1 and therefore has zero variance. Although it is assumed that x(n) is a binary signal, the teaching and idea of the present invention are still valid for other types of signals.
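The mean vector and covariance matrix of X₁ can be checked by brute force for L=2, since the eight free symbols take only 2⁸ equally likely combinations. The sketch below assumes NumPy and unit symbol power; the variable names are chosen for the example only.

```python
import itertools
import numpy as np

# All 2^8 equally likely input vectors with x(n) fixed to +1 (class omega_1).
vectors = np.array([
    list(bits[:4]) + [1.0] + list(bits[4:])
    for bits in itertools.product([-1.0, 1.0], repeat=8)
])

mu = vectors.mean(axis=0)                        # mean vector of X_1
centered = vectors - mu
cov = centered.T @ centered / len(vectors)       # covariance matrix of X_1

print(mu)
print(np.diag(cov))
```

Because the enumeration is exhaustive, the result is exact rather than a Monte Carlo estimate: the mean has a single 1 in the fifth position and the covariance equals I−μμ^(T).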

If state −1 is transmitted, the input vector is represented by X₋₁: X(n)=[x(n−4),x(n−3),x(n−2),x(n−1),−1,x(n+1),x(n+2),x(n+3),x(n+4)]^(T) where X₋₁ represents a random vector and the subscript of random vector X₋₁ indicates that state −1 is transmitted (class ω₂). The mean vector and covariance matrix of X₋₁ are given by μ_(X₋₁)=E{X₋₁}=[0,0,0,0,−1,0,0,0,0]^(T)=−μ_(X₁), Σ_(X₋₁)=E{(X₋₁−μ_(X₋₁))(X₋₁−μ_(X₋₁))^(T)}=Diag(1,1,1,1,0,1,1,1,1)=Σ_(X₁).

On the other hand, if state 1 is transmitted, the received vector can be computed as

$\begin{matrix} {{Y_{1}:{Y(n)}} = \begin{bmatrix} {y\left( {n - 2} \right)} \\ {y\left( {n - 1} \right)} \\ {y(n)} \\ {y\left( {n + 1} \right)} \\ {y\left( {n + 2} \right)} \end{bmatrix}} \\ {= {{\left\lbrack \begin{matrix} a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 \\ 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 \\ 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 \\ 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 \\ 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} \end{matrix} \right\rbrack\begin{bmatrix} {x\left( {n - 4} \right)} \\ {x\left( {n - 3} \right)} \\ {x\left( {n - 2} \right)} \\ {x\left( {n - 1} \right)} \\ 1 \\ {x\left( {n + 1} \right)} \\ {x\left( {n + 2} \right)} \\ {x\left( {n + 3} \right)} \\ {x\left( {n + 4} \right)} \end{bmatrix}} +}} \\ {\begin{bmatrix} {e\left( {n - 2} \right)} \\ {e\left( {n - 1} \right)} \\ {e(n)} \\ {e\left( {n + 1} \right)} \\ {e\left( {n + 2} \right)} \end{bmatrix}} \end{matrix}$ where the subscript of Y₁ indicates that state 1 is transmitted (class ω₁). In this paradigm, the received vector, Y₁, can also be viewed as a random vector. It is noted that Y₁(n) represents a sample vector of random vector Y₁ with x(n)=1. The mean vector and covariance matrix of Y₁ are given by

$\begin{aligned} \mu_{Y_{1}} &= A\mu_{X_{1}} = \left[ a_{2},a_{1},a_{0},a_{1},a_{2} \right]^{T} \\ \Sigma_{Y_{1}} &= E\left\{ \left( Y_{1} - \mu_{Y_{1}} \right)\left( Y_{1} - \mu_{Y_{1}} \right)^{T} \right\} \\ &= E\left\{ \left( AX_{1} - A\mu_{X_{1}} + N \right)\left( AX_{1} - A\mu_{X_{1}} + N \right)^{T} \right\} \\ &= E\left\{ \left( AX_{1} - A\mu_{X_{1}} \right)\left( AX_{1} - A\mu_{X_{1}} \right)^{T} \right\} + E\left\{ \left( AX_{1} - A\mu_{X_{1}} \right)N^{T} \right\} + E\left\{ N\left( AX_{1} - A\mu_{X_{1}} \right)^{T} \right\} + E\left\{ NN^{T} \right\} \\ &= A\Sigma_{X_{1}}A^{T} + \sigma_{e}^{2}I + AE\left\{ X_{1}N^{T} \right\} - A\mu_{X_{1}}E\left\{ N^{T} \right\} + E\left\{ NX_{1}^{T} \right\}A^{T} - E\left\{ N \right\}\mu_{X_{1}}^{T}A^{T} \end{aligned}$ where N is the noise vector. Since the noise is zero-mean and is assumed to be independent of the input, AE{X₁N^(T)}=Aμ_(X₁)E{N^(T)}=E{NX₁^(T)}A^(T)=E{N}μ_(X₁)^(T)A^(T)=0, and the covariance matrix of Y₁ is given by

$\begin{aligned} \Sigma_{Y_{1}} &= A\Sigma_{X_{1}}A^{T} + \sigma_{e}^{2}I \\ &= \begin{bmatrix} a_{2}^{2} + 2a_{1}^{2} + a_{0}^{2} + \sigma_{e}^{2} & a_{1}a_{2} + 2a_{0}a_{1} & a_{0}a_{2} + a_{1}^{2} & a_{1}a_{2} & 0 \\ a_{1}a_{2} + 2a_{0}a_{1} & 2a_{2}^{2} + a_{1}^{2} + a_{0}^{2} + \sigma_{e}^{2} & 2a_{1}a_{2} + a_{0}a_{1} & 2a_{0}a_{2} & a_{1}a_{2} \\ a_{0}a_{2} + a_{1}^{2} & 2a_{1}a_{2} + a_{0}a_{1} & 2a_{2}^{2} + 2a_{1}^{2} + \sigma_{e}^{2} & 2a_{1}a_{2} + a_{0}a_{1} & a_{0}a_{2} + a_{1}^{2} \\ a_{1}a_{2} & 2a_{0}a_{2} & 2a_{1}a_{2} + a_{0}a_{1} & 2a_{2}^{2} + a_{1}^{2} + a_{0}^{2} + \sigma_{e}^{2} & a_{1}a_{2} + 2a_{0}a_{1} \\ 0 & a_{1}a_{2} & a_{0}a_{2} + a_{1}^{2} & a_{1}a_{2} + 2a_{0}a_{1} & a_{2}^{2} + 2a_{1}^{2} + a_{0}^{2} + \sigma_{e}^{2} \end{bmatrix}. \end{aligned}$
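The closed-form entries above can be spot-checked numerically by evaluating AΣ_(X₁)A^(T)+σ_(e) ²I directly. The sketch assumes NumPy; the tap values follow the a₀=1, a₁=0.5, a₂=0.2 example used for the figures, while the noise level `sigma_e` is a hypothetical value chosen for the check.

```python
import numpy as np

a0, a1, a2, sigma_e = 1.0, 0.5, 0.2, 0.1   # sigma_e is an illustrative choice
taps = np.array([a2, a1, a0, a1, a2])

# 5 x 9 channel matrix for L = 2 (symmetric channel).
A = np.zeros((5, 9))
for i in range(5):
    A[i, i:i + 5] = taps

cov_x = np.diag([1.0, 1, 1, 1, 0, 1, 1, 1, 1])         # covariance of X_1
cov_y = A @ cov_x @ A.T + sigma_e**2 * np.eye(5)        # covariance of Y_1

print(np.round(cov_y, 3))
```

The matrix is symmetric, and its (1,5) entry vanishes because y(n−2) and y(n+2) share only the fixed symbol x(n).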

Similarly, if state −1 is transmitted, the received vector can be computed as

$\begin{aligned} Y_{-1}: Y(n) &= \begin{bmatrix} y(n-2) \\ y(n-1) \\ y(n) \\ y(n+1) \\ y(n+2) \end{bmatrix} \\ &= \begin{bmatrix} a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 \\ 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 \\ 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 \\ 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 \\ 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} \end{bmatrix}\begin{bmatrix} x(n-4) \\ x(n-3) \\ x(n-2) \\ x(n-1) \\ -1 \\ x(n+1) \\ x(n+2) \\ x(n+3) \\ x(n+4) \end{bmatrix} + \begin{bmatrix} e(n-2) \\ e(n-1) \\ e(n) \\ e(n+1) \\ e(n+2) \end{bmatrix} \end{aligned}$ where the subscript of Y₋₁ indicates that state −1 is transmitted (class ω₂). As explained previously, the received vector, Y₋₁, can also be viewed as a random vector. When state −1 is transmitted, the mean vector and covariance matrix of Y₋₁ are given by μ_(Y₋₁)=Aμ_(X₋₁)=−[a₂,a₁,a₀,a₁,a₂]^(T)=−μ_(Y₁)

$\begin{matrix} {\Sigma_{Y_{- 1}} = {E\left\{ {\left( {Y_{- 1} - \mu_{Y_{- 1}}} \right)\left( {Y_{- 1} - \mu_{Y_{- 1}}} \right)^{T}} \right\}}} \\ {= {E\left\{ {\left( {{A\; X_{- 1}} - {A\;\mu_{X_{- 1}}} + N} \right)\left( {{A\; X_{- 1}} - {A\;\mu_{X_{- 1}}} + N} \right)^{T}} \right\}}} \\ {= {{A\;\Sigma_{X_{- 1}}A^{T}} + {\sigma_{e}^{2}I}}} \\ {= \Sigma_{Y_{1}}} \end{matrix}$

It can be seen that μ_(Y₁)=−μ_(Y₋₁) and Σ_(Y₁)=Σ_(Y₋₁). Assuming L=2, the received vectors, Y₁ and Y₋₁, are distributed in a 5-dimensional space. FIG. 1 shows the distribution of Y₁ and Y₋₁ in the y_(n)−y_(n−1) space assuming that a₀=1, a₁=0.5, a₂=0.2, and e(n)=0, and FIG. 2 shows the distribution of Y₁ and Y₋₁ in the y_(n+1)−y_(n−1) space. If the Gaussian distribution is assumed for Y₁ and Y₋₁, the decision boundary between Y₁ and Y₋₁ is a hyper-plane since the covariance matrices are identical. Consequently, the optimal linear classifier is given by $\begin{matrix} {w = \Sigma_{Y}^{-1}\left( \mu_{Y_{1}} - \mu_{Y_{-1}} \right) = 2\Sigma_{Y}^{-1}\mu_{Y_{1}} = 2\Sigma_{Y}^{-1}\left[ a_{2},a_{1},a_{0},a_{1},a_{2} \right]^{T}} & (5) \end{matrix}$ where Σ_(Y)=AΣ_(X₁)A^(T)+σ_(e) ²I. In other words, if {a_(k)} and σ_(e) ² are given, the optimal linear equalizer is obtained using equation (5). In this case, the decision rule is as follows:

-   if w^(T)Y(n)>0, then decide that state 1 is transmitted,
-   if w^(T)Y(n)<0, then decide that state −1 is transmitted.

If w^(T)Y(n)=0, one can decide the received signal as either state 1 or state −1. Alternatively, one can reject it. In practice, such a case would be rare. It is noted that equation (5) can be used even if the Gaussian distribution is not assumed for Y₁ and Y₋₁.
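Equation (5) and the sign-based decision rule can be sketched end to end as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes NumPy, uses the a₀=1, a₁=0.5, a₂=0.2 taps from the figures, and picks a hypothetical noise level and sample count for a small Monte Carlo check.

```python
import numpy as np

a0, a1, a2, sigma_e = 1.0, 0.5, 0.2, 0.3   # sigma_e is an illustrative choice
taps = np.array([a2, a1, a0, a1, a2])
A = np.zeros((5, 9))
for i in range(5):
    A[i, i:i + 5] = taps

# Known-channel statistics: mu_Y1 = A mu_X1 is the fifth column of A.
mu_y1 = A[:, 4]
cov_y = A @ np.diag([1.0, 1, 1, 1, 0, 1, 1, 1, 1]) @ A.T + sigma_e**2 * np.eye(5)

# Equation (5): w = 2 * inv(cov_y) @ mu_y1, computed with a linear solve.
w = 2.0 * np.linalg.solve(cov_y, mu_y1)

# Monte Carlo check: random input vectors, decision by the sign of w^T Y(n).
rng = np.random.default_rng(1)
X = rng.choice([-1.0, 1.0], size=(20000, 9))
Y = X @ A.T + sigma_e * rng.standard_normal((20000, 5))
decisions = np.where(Y @ w > 0, 1.0, -1.0)
error_rate = np.mean(decisions != X[:, 4])
print(error_rate)
```

Using `np.linalg.solve` avoids forming the inverse explicitly; the resulting weight vector is the same as in equation (5).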

Embodiment 2

In many real world problems, the coefficients {a_(k)} and the noise power σ_(e) ² of equations (2) and (3) are unknown, and they should be estimated from training samples. Typically, at the beginning of communication or at a regular interval, the transmitter transmits a training sequence, which is also known at the receiver. By analyzing the received signals, the receiver estimates the channel characteristics. For instance, the linear equalizer is implemented using the following LMS algorithm: z[n]=W^(T)(n)Y(n) where Y(n)=[y(n−L),y(n−L+1), . . . ,y(n), . . . ,y(n+L−1),y(n+L)]^(T), z[n] is an output of the equalizer, and W(n)=[w_(−L),w_(−L+1), . . . ,w_(0), . . . ,w_(L−1),w_(L)]^(T) is a weight vector. The weight vector is updated as follows: W(n+1)=W(n)+cλY(n) where λ is the learning rate, and c is 1 if signal 1 is transmitted and −1 if signal −1 is transmitted. As can be seen, the weight vector is updated using the training sequences that are transmitted at the beginning of communication or at a regular interval.

The idea and teaching of the present invention can be applied to this situation. From the training sequence, the mean vector and the covariance matrix of Y₁ can be estimated as follows:

$\begin{aligned} \hat{\mu}_{Y_{1}} &= \frac{1}{N_{1}}\sum\limits_{i=1}^{N_{1}} Y_{1}(i) \\ \hat{\Sigma}_{Y_{1}} &= \frac{1}{N_{1}}\sum\limits_{i=1}^{N_{1}} \left( Y_{1}(i) - \hat{\mu}_{Y_{1}} \right)\left( Y_{1}(i) - \hat{\mu}_{Y_{1}} \right)^{T} \end{aligned}$ where Y₁(i) is the i-th received vector of class ω₁ (state 1 is transmitted), the subscript of Y₁(i) indicates that state 1 is transmitted and N₁ is the number of vectors corresponding to class ω₁ that can be obtained from the training sequence. It is noted that Y₁ is a random vector and Y₁(i) is the i-th sample vector. Similarly, the mean vector and the covariance matrix of Y₋₁ can be estimated as follows:

$\begin{aligned} \hat{\mu}_{Y_{-1}} &= \frac{1}{N_{-1}}\sum\limits_{i=1}^{N_{-1}} Y_{-1}(i) \\ \hat{\Sigma}_{Y_{-1}} &= \frac{1}{N_{-1}}\sum\limits_{i=1}^{N_{-1}} \left( Y_{-1}(i) - \hat{\mu}_{Y_{-1}} \right)\left( Y_{-1}(i) - \hat{\mu}_{Y_{-1}} \right)^{T} \end{aligned}$ where N₋₁ is the number of vectors corresponding to class ω₂ that can be obtained from the training sequence. In general, when L=2 and the samples are indexed from 1, the first two training samples may not generate the corresponding received vectors since y(−1) and y(0) are required to do so. Although one might use arbitrary values for y(−1) and y(0), it might be better not to use them. Similarly, when L=2, the last two training samples do not generate the corresponding received vectors since y(101) and y(102) are required to do so, assuming that there are 100 training samples. FIG. 3 illustrates how Y₁(i) and Y₋₁(i) are constructed. As explained previously, the first two samples do not generate output vectors. Since x(3)=−1, the corresponding vector will belong to Y₋₁. In other words, Y₋₁(1)=[−1.43,−2.15,−1.79,−1.4,0.21]^(T). Since x(4)=−1, the corresponding vector will also belong to Y₋₁, with Y₋₁(2)=[−2.15,−1.79,−1.4,0.21,−0.03]^(T). Similarly, the following received vectors are Y₁(1)=[−1.79,−1.4,0.21,−0.03,0.96]^(T), Y₋₁(3)=[−1.4,0.21,−0.03,0.96,1.01]^(T), and Y₁(2)=[0.21,−0.03,0.96,1.01,0.31]^(T). It is noted that the indices of Y₁ and Y₋₁ are different from those of Y.
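The construction of the class vectors can be sketched in a few lines of plain Python. The received values below are the nine samples that can be read off the vectors quoted above (taken here to be y(1) through y(9)), and the symbols x(3) through x(7) are those stated in the text; the function name `build_class_vectors` is hypothetical, and symbols whose windows are incomplete are left as `None` because they are never used.

```python
def build_class_vectors(x, y, L=2):
    """Group windows [y(n-L), ..., y(n+L)] by the known training symbol x(n).

    The text indexes samples from 1; here list index 0 holds sample 1.
    Windows that would need samples outside y are skipped.
    """
    classes = {1: [], -1: []}
    for n in range(L, len(y) - L):
        classes[x[n]].append(y[n - L:n + L + 1])
    return classes

# y(1)..y(9) as inferred from the quoted vectors; x(3)..x(7) as stated.
y = [-1.43, -2.15, -1.79, -1.4, 0.21, -0.03, 0.96, 1.01, 0.31]
x = [None, None, -1, -1, 1, -1, 1, None, None]

classes = build_class_vectors(x, y)
print(classes[-1])   # vectors of class omega_2 (state -1)
print(classes[1])    # vectors of class omega_1 (state 1)
```

The class-wise lists are indexed independently of the time index n, which is why the indices of the class vectors differ from those of Y.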

If a linear dispersive channel is assumed, $\hat{\Sigma}_{Y_{1}}$ and $\hat{\Sigma}_{Y_{-1}}$ should be identical and $\hat{\mu}_{Y_{1}} = -\hat{\mu}_{Y_{-1}}$. However, in real world problems, these estimates might differ. Thus, one might take the average of $\hat{\Sigma}_{Y_{1}}$ and $\hat{\Sigma}_{Y_{-1}}$ as follows: $\hat{\Sigma}_{Y} = \left( \hat{\Sigma}_{Y_{1}} + \hat{\Sigma}_{Y_{-1}} \right)/2$.

Finally, the linear equalizer is obtained by $\begin{matrix} {w = \hat{\Sigma}_{Y}^{-1}\left( \hat{\mu}_{Y_{1}} - \hat{\mu}_{Y_{-1}} \right).} & (6) \end{matrix}$

It is noted that equation (6) can be used for any type of channel. In other words, the channel does not have to be linear. It can be a non-symmetric channel or a non-linear channel. If there are more than 2 states, the idea and teaching of the present invention can be easily extended. As explained previously, the decision rule is as follows:

-   if w^(T)Y(n)>0, then decide that state 1 is transmitted,
-   if w^(T)Y(n)<0, then decide that state −1 is transmitted.

If w^(T)Y(n)=0, one can decide the received signal as either state 1 or state −1. In practice, such a case would be rare.
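The training-based procedure of this embodiment (class-wise sample statistics, averaged covariance, equation (6), sign decision) can be sketched as below. The sketch assumes NumPy; the function name `train_equalizer`, the simulated channel taps, the noise level and the training length are all hypothetical choices made so that the example is self-contained.

```python
import numpy as np

def train_equalizer(Y1, Ym1):
    """Return w from equation (6); Y1 and Ym1 hold one received vector per row."""
    mu1, mum1 = Y1.mean(axis=0), Ym1.mean(axis=0)
    # bias=True divides by N, matching the 1/N estimators in the text.
    cov1 = np.cov(Y1, rowvar=False, bias=True)
    covm1 = np.cov(Ym1, rowvar=False, bias=True)
    cov = 0.5 * (cov1 + covm1)                    # averaged covariance estimate
    return np.linalg.solve(cov, mu1 - mum1)

# Hypothetical training run with L = 2 on a simulated channel.
rng = np.random.default_rng(2)
taps = np.array([0.2, 0.5, 1.0, 0.5, 0.2])
x = rng.choice([-1.0, 1.0], size=500)
y = np.convolve(x, taps)[2:2 + len(x)] + 0.3 * rng.standard_normal(len(x))

# Keep only complete windows [y(n-2), ..., y(n+2)] and their training symbols.
windows = np.array([y[n - 2:n + 3] for n in range(2, len(x) - 2)])
labels = x[2:len(x) - 2]

w = train_equalizer(windows[labels == 1], windows[labels == -1])
decisions = np.where(windows @ w > 0, 1.0, -1.0)
train_error = np.mean(decisions != labels)
print(train_error)
```

Nothing in `train_equalizer` depends on the channel being linear or symmetric, since it only uses the labeled received vectors.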

Embodiment 3

According to the teaching and idea of the present invention, equalization is viewed as a classification problem where the equalizer classifies the received vector Y(n) as one of the binary states (1, −1). This idea can be extended to multi-states. In other words, instead of viewing equalization as a classification problem where the equalizer classifies the received vector Y(n) as one of the binary states, equalization can be viewed as a multi-class classification problem. For instance, equalization might be viewed as an 8-class classification problem when 3 bits are considered. In other words, if state (1,1,1) is transmitted (class ω₁), the input vector is given by X₁: X(n)=[x(n−5),x(n−4),x(n−3),x(n−2),1,1,1,x(n+2),x(n+3),x(n+4),x(n+5)]^(T), assuming L=2. It is noted that (1,1,1) represents the values of x(n−1), x(n), x(n+1), respectively. Similarly, when the remaining 7 states are transmitted, the corresponding input vectors are given by
X₂: X(n)=[x(n−5),x(n−4),x(n−3),x(n−2),1,1,−1,x(n+2),x(n+3),x(n+4),x(n+5)]^(T)
X₃: X(n)=[x(n−5),x(n−4),x(n−3),x(n−2),1,−1,1,x(n+2),x(n+3),x(n+4),x(n+5)]^(T)
X₄: X(n)=[x(n−5),x(n−4),x(n−3),x(n−2),1,−1,−1,x(n+2),x(n+3),x(n+4),x(n+5)]^(T)
X₅: X(n)=[x(n−5),x(n−4),x(n−3),x(n−2),−1,1,1,x(n+2),x(n+3),x(n+4),x(n+5)]^(T)
X₆: X(n)=[x(n−5),x(n−4),x(n−3),x(n−2),−1,1,−1,x(n+2),x(n+3),x(n+4),x(n+5)]^(T)
X₇: X(n)=[x(n−5),x(n−4),x(n−3),x(n−2),−1,−1,1,x(n+2),x(n+3),x(n+4),x(n+5)]^(T)
X₈: X(n)=[x(n−5),x(n−4),x(n−3),x(n−2),−1,−1,−1,x(n+2),x(n+3),x(n+4),x(n+5)]^(T)

It is noted that random vectors X₁, X₂, X₃, X₄, X₅, X₆, X₇ and X₈ correspond to class ω₁, class ω₂, class ω₃, class ω₄, class ω₅, class ω₆, class ω₇ and class ω₈, respectively.

Assuming L=2 in equation (1), {y(n−3), y(n−2), y(n−1), y(n), y(n+1), y(n+2), y(n+3)} will be affected by x(n−1), x(n) and x(n+1). In other words, {y(n−3), y(n−2), y(n−1), y(n), y(n+1), y(n+2), y(n+3) } contain some information on x(n−1), x(n) and x(n+1) and can be used for equalization. Furthermore, if state (1,1,1) is transmitted, the received vector can be computed as

$\begin{matrix} {{Y_{1}\text{:}\mspace{14mu}{Y(n)}} = \begin{bmatrix} {y\left( {n - 3} \right)} \\ {y\left( {n - 2} \right)} \\ {y\left( {n - 1} \right)} \\ {y(n)} \\ {y\left( {n + 1} \right)} \\ {y\left( {n + 2} \right)} \\ {y\left( {n + 3} \right)} \end{bmatrix}} \\ {= \begin{bmatrix} a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} \end{bmatrix}} \\ {\begin{bmatrix} {x\left( {n - 5} \right)} \\ {x\left( {n - 4} \right)} \\ {x\left( {n - 3} \right)} \\ {x\left( {n - 2} \right)} \\ 1 \\ 1 \\ 1 \\ {x\left( {n + 2} \right)} \\ {x\left( {n + 3} \right)} \\ {x\left( {n + 4} \right)} \\ {x\left( {n + 5} \right)} \end{bmatrix} + \begin{bmatrix} {e\left( {n - 3} \right)} \\ {e\left( {n - 2} \right)} \\ {e\left( {n - 1} \right)} \\ {e(n)} \\ {e\left( {n + 1} \right)} \\ {e\left( {n + 2} \right)} \\ {e\left( {n + 3} \right)} \end{bmatrix}} \end{matrix}$

On the other hand, the mean vector and covariance matrix of input vector X₁ are given by μ_(X₁)=E{X₁}=[0,0,0,0,1,1,1,0,0,0,0]^(T), Σ_(X₁)=E{(X₁−μ_(X₁))(X₁−μ_(X₁))^(T)}=Diag(1,1,1,1,0,0,0,1,1,1,1).

As before, the mean vector and covariance matrix of received vector Y₁ are given by μ_(Y₁)=Aμ_(X₁)=[a₂, a₁+a₂, a₀+a₁+a₂, a₀+2a₁, a₀+a₁+a₂, a₁+a₂, a₂]^(T), Σ_(Y₁)=AΣ_(X₁)A^(T)+σ_(e) ²I where

$A = {\begin{bmatrix} a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & a_{2} & a_{1} & a_{0} & a_{1} & a_{2} \end{bmatrix}.}$
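The class means for all eight 3-bit states follow from the same product Aμ_(X). The sketch below assumes NumPy and uses the illustrative taps a₀=1, a₁=0.5, a₂=0.2; the dictionary `class_means` is a name chosen for the example.

```python
import itertools
import numpy as np

a0, a1, a2 = 1.0, 0.5, 0.2            # illustrative tap values
taps = np.array([a2, a1, a0, a1, a2])

# 7 x 11 channel matrix for L = 2 and a 3-symbol state.
A = np.zeros((7, 11))
for i in range(7):
    A[i, i:i + 5] = taps

# Class means A @ mu_X for the 8 states of (x(n-1), x(n), x(n+1)).
# itertools.product yields (1,1,1), (1,1,-1), ..., (-1,-1,-1), i.e. the
# same order as classes omega_1 .. omega_8.
class_means = {}
for state in itertools.product([1.0, -1.0], repeat=3):
    mu_x = np.zeros(11)
    mu_x[4:7] = state                  # the three fixed symbols of the state
    class_means[state] = A @ mu_x

print(class_means[(1.0, 1.0, 1.0)])
```

For state (1,1,1) the middle entry equals a₀+2a₁, because y(n) is affected by x(n−1), x(n) and x(n+1) through a₁, a₀ and a₁.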

The input vectors, received vectors, mean vectors and covariance matrices of the other states can be obtained similarly. Assuming a Gaussian distribution for each class and equi-probable classes, the following decision rule can be used: Decide ω_(j) if g_(j)(Y)>g_(k)(Y) for all k≠j  (7) where $g_{j}(Y) = -\ln\left( \left| \Sigma_{Y_{j}} \right| \right) - \left( Y - \mu_{Y_{j}} \right)^{T}\Sigma_{Y_{j}}^{-1}\left( Y - \mu_{Y_{j}} \right)$, with Y and μ_(Y_j) taken as column vectors as defined above.

If the covariance matrices are identical, the above rule can be further simplified as follows: Decide ω_(j) if g_(j)(Y)>g_(k)(Y) for all k≠j  (8) where $g_{j}(Y) = 2Y^{T}\Sigma_{Y_{j}}^{-1}\mu_{Y_{j}} - \mu_{Y_{j}}^{T}\Sigma_{Y_{j}}^{-1}\mu_{Y_{j}} = 2\mu_{Y_{j}}^{T}\Sigma_{Y_{j}}^{-1}Y - \mu_{Y_{j}}^{T}\Sigma_{Y_{j}}^{-1}\mu_{Y_{j}}$.

It is noted that the above decision rules can be used even when one cannot assume the Gaussian distribution for each class.
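Decision rule (7) can be sketched as a small function that evaluates g_(j)(Y) for every class and picks the largest. The sketch assumes NumPy, equal class priors and column-vector conventions; `gaussian_discriminant`, `classify` and the two-dimensional toy classes are hypothetical names and values used only for illustration.

```python
import numpy as np

def gaussian_discriminant(Y, mu, cov):
    """g_j(Y) = -ln|cov| - (Y - mu)^T cov^{-1} (Y - mu), equal priors assumed."""
    diff = Y - mu
    _, logdet = np.linalg.slogdet(cov)
    return -logdet - diff @ np.linalg.solve(cov, diff)

def classify(Y, means, covs):
    """Return the 0-based index j of the class with the largest g_j(Y)."""
    scores = [gaussian_discriminant(Y, m, c) for m, c in zip(means, covs)]
    return int(np.argmax(scores))

# Toy two-class example in two dimensions (hypothetical values).
means = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
label = classify(np.array([0.8, 0.1]), means, covs)
print(label)
```

When all covariance matrices are equal, the log-determinant and the quadratic term in Y are common to every class, which is what reduces rule (7) to the linear form of rule (8).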

In general, the coefficients {a_(k)} and the noise power σ_(e) ² are unknown in many real world problems. As in the previous cases, all the parameters can be estimated from a training sequence. For instance, the mean vector and the covariance matrix of Y₁ can be estimated as follows:

$\begin{matrix} {{\hat{\mu}}_{Y_{1}} = {\frac{1}{N_{1}}{\sum\limits_{i = 1}^{N_{1}}\;{Y_{1}(i)}}}} \\ {{\hat{\Sigma}}_{Y_{1}} = {\frac{1}{N_{1}}{\sum\limits_{i = 1}^{N_{1}}{\left( \;{{Y_{1}(i)} - {\hat{\mu}}_{Y_{1}}} \right)\left( \;{{Y_{1}(i)} - {\hat{\mu}}_{Y_{1}}} \right)^{T}}}}} \end{matrix}$ where Y₁(i) is the i-th received vector of class ω₁ (state (1,1,1) is transmitted), the subscript of Y₁(i) indicates that state (1,1,1) is transmitted and N₁ is the number of vectors corresponding to class ω₁ that can be obtained from the training sequence. The mean vectors and the covariance matrices of the other received vectors corresponding to the other states can be estimated similarly. In other words, the mean vector and covariance matrix of class ω_(j) can be computed as follows:

$\begin{matrix} {{\hat{\mu}}_{Y_{j}} = {\frac{1}{N_{j}}{\sum\limits_{i = 1}^{N_{j}}\;{Y_{j}(i)}}}} \\ {{\hat{\Sigma}}_{Y_{j}} = {\frac{1}{N_{j}}{\sum\limits_{i = 1}^{N_{j}}{\left( \;{{Y_{j}(i)} - {\hat{\mu}}_{Y_{j}}} \right){\left( \;{{Y_{j}(i)} - {\hat{\mu}}_{Y_{j}}} \right)^{T}.}}}}} \end{matrix}$

By treating three received samples as one state, the above equalizer can classify three samples at one time (FIG. 4 a). However, in order to enhance the performance, the above equalizer can classify one sample at a time. In this scheme, states {(1,1,1), (1,1,−1), (−1,1,1), (−1,1,−1)} indicate that state 1 is transmitted and states {(1,−1,1), (1,−1,−1), (−1,−1,1), (−1,−1,−1)} indicate that state −1 is transmitted. This operation is illustrated in FIG. 4 b.
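The one-sample-at-a-time scheme amounts to reading off the middle symbol of the decided 3-bit state. A minimal sketch in plain Python follows; the names `STATES` and `middle_symbol` are hypothetical, and classes are indexed from 0 so that index 0 corresponds to class ω₁.

```python
import itertools

# States in the order omega_1 .. omega_8: (1,1,1), (1,1,-1), ..., (-1,-1,-1).
STATES = list(itertools.product([1, -1], repeat=3))

def middle_symbol(class_index):
    """Return the decided x(n) for a 0-based class index (omega_1 -> 0)."""
    return STATES[class_index][1]

decided = [middle_symbol(j) for j in range(8)]
print(decided)
```

Classes ω₁, ω₂, ω₅ and ω₆ map to state 1 and the remaining four map to state −1, matching the grouping described above.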

In the above illustration, equalization is viewed as an 8-class classification problem by considering 3 bits. It is noted that by considering a different number of bits, the number of classes can be changed. For instance, if 5 bits are considered, equalization can be viewed as a 32-class classification problem.

Embodiment 4

The idea and teaching of the present invention can be applied to blind equalization. In many real world problems, not only are the coefficients {a_(k)} and the noise power σ_(e) ² unknown, but a training sequence is also not available. In this case, the receiver should estimate those parameters without training samples. By analyzing received signals, the receiver estimates the channel characteristics using unlabeled received samples. The idea and teaching of the present invention can be applied to this kind of problem. For instance, the equalizer starts with initial parameters {N_(j) ^(initial),μ_(j) ^(initial),Σ_(j) ^(initial)} where N_(j) ^(initial) is the initially assumed number of training samples of class ω_(j), μ_(j) ^(initial) is the initial estimator of the mean vector of class ω_(j), and Σ_(j) ^(initial) is the initial estimator of the covariance matrix of class ω_(j). Then, the equalizer classifies an incoming sample and uses the result to update the statistics of the corresponding class. In order to recursively update the parameters (mean vectors and covariance matrices), the following statistics are computed:

$\begin{matrix} {{{SUMY}_{j}\left( N_{j} \right)} = {\sum\limits_{i = 1}^{N_{j}}\;{Y_{j}(i)}}} \\ {{{SUMYY}_{j}\left( N_{j} \right)} = {\sum\limits_{i = 1}^{N_{j}}\;{{Y_{j}(i)}{Y_{j}(i)}^{T}}}} \end{matrix}$ where Y_(j)(i) is a vector. With the initial parameters, SUMY_(j) and SUMYY_(j) of class ω_(j) are computed as follows:

$\begin{aligned} {SUMY}_{j}\left( N_{j} \right) &= {SUMY}_{j}^{initial} + \sum\limits_{i = N_{j}^{initial} + 1}^{N_{j}} Y_{j}(i) \\ {SUMYY}_{j}\left( N_{j} \right) &= {SUMYY}_{j}^{initial} + \sum\limits_{i = N_{j}^{initial} + 1}^{N_{j}} Y_{j}(i)Y_{j}(i)^{T} \end{aligned}$ where N_(j)=N_(j) ^(initial)+N_(j) ^(actual), N_(j) ^(actual) is the actual number of samples that are classified as class ω_(j), Y_(j)(i) for i>N_(j) ^(initial) are the received samples that are classified as class ω_(j), SUMY_(j) ^(initial)=N_(j) ^(initial)μ_(j) ^(initial), and SUMYY_(j) ^(initial)=(N_(j) ^(initial)−1)Σ_(j) ^(initial)+N_(j) ^(initial)μ_(j) ^(initial)(μ_(j) ^(initial))^(T).

Then, provided that there are N_(j) training samples for class ω_(j), the mean vector and covariance matrix of class ω_(j) can be computed as follows:

$\hat{\mu}_{j}(N_{j}) = \frac{1}{N_{j}}{SUMY}_{j}(N_{j})$ $\hat{\Sigma}_{j}(N_{j}) = \frac{1}{N_{j} - 1}{SUMYY}_{j}(N_{j}) - \frac{N_{j}}{N_{j} - 1}\hat{\mu}_{j}(N_{j})\hat{\mu}_{j}(N_{j})^{T}.$
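As a quick consistency check on the formulas above, the following Python sketch seeds the running sums from a set of initial parameters and then recovers the mean and covariance from them. The function names `init_sums` and `stats_from_sums`, the use of plain lists for vectors and matrices, and the numeric values are assumptions made for illustration only; SUMY_(j) ^(initial) is taken to be N_(j) ^(initial)μ_(j) ^(initial).

```python
def init_sums(n_init, mu_init, sigma_init):
    """Seed SUMY and SUMYY from the initial parameters of one class."""
    d = len(mu_init)
    sum_y = [n_init * m for m in mu_init]
    sum_yy = [[(n_init - 1) * sigma_init[r][c] + n_init * mu_init[r] * mu_init[c]
               for c in range(d)] for r in range(d)]
    return sum_y, sum_yy


def stats_from_sums(sum_y, sum_yy, n):
    """Return (mean, covariance) computed from the running sums; needs n >= 2."""
    d = len(sum_y)
    mean = [s / n for s in sum_y]
    cov = [[sum_yy[r][c] / (n - 1) - n / (n - 1) * mean[r] * mean[c]
            for c in range(d)] for r in range(d)]
    return mean, cov


# Illustrative initial parameters (hypothetical values, not from the text).
mu0 = [1.0, -0.5]
sigma0 = [[0.5, 0.1], [0.1, 0.25]]
sum_y, sum_yy = init_sums(4, mu0, sigma0)
mean, cov = stats_from_sums(sum_y, sum_yy, 4)
```

Before any received sample is added, the recovered estimates should match the initial parameters up to rounding, which is why N_(j) ^(initial) behaves like a count of virtual training samples.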

It is noted that N_(j)=N_(j) ^(initial)+N_(j) ^(actual) where N_(j) ^(actual) is the actual number of samples that are classified as class ω_(j). Thus, if N_(j) ^(initial) is small, the mean vector and covariance matrix of class ω_(j) will change rapidly as new samples are added to update the statistics. On the other hand, if N_(j) ^(initial) is large, the mean vector and covariance matrix of class ω_(j) will change more slowly.
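The dependence on N_(j) ^(initial) can be made explicit with a short worked step, added here for illustration. Since SUMY_(j)(N_(j)) equals N_(j) times the current mean, and one newly classified sample Y (a shorthand used only in this step) adds Y to the sum:

$\hat{\mu}_{j}(N_{j} + 1) = \frac{N_{j}\hat{\mu}_{j}(N_{j}) + Y}{N_{j} + 1}$ so that $\hat{\mu}_{j}(N_{j} + 1) - \hat{\mu}_{j}(N_{j}) = \frac{1}{N_{j} + 1}\left( Y - \hat{\mu}_{j}(N_{j}) \right).$

For the first classified sample N_(j)=N_(j) ^(initial). As a hypothetical example, with N_(j) ^(initial)=2 the mean moves one third of the way toward Y, whereas with N_(j) ^(initial)=100 it moves only 1/101 of the way.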

As described previously, the following decision rule can be used: Decide ω_(j) if g _(j)(Y)>g _(k)(Y) for all k≠j  (9) where $g_{j}(Y) = -\ln(|\hat{\Sigma}_{j}|) - (Y - \hat{\mu}_{j})\hat{\Sigma}_{j}^{-1}(Y - \hat{\mu}_{j})^{T}$.
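A minimal Python sketch of decision rule (9) is given below for illustration. To stay short it handles only two-dimensional vectors, with the helper names `det2`, `inv2`, `discriminant` and `classify` and all numeric values being assumptions rather than part of the described method. Because the quadratic term is a scalar, the result does not depend on whether Y is stored as a row or a column.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def discriminant(y, mean, cov):
    """g_j(Y) = -ln|cov| - (Y - mean) cov^-1 (Y - mean)^T for 2-D inputs."""
    diff = [y[0] - mean[0], y[1] - mean[1]]
    inv = inv2(cov)
    quad = sum(diff[r] * inv[r][c] * diff[c] for r in range(2) for c in range(2))
    return -math.log(det2(cov)) - quad


def classify(y, means, covs):
    """Return the index j of the class with the largest g_j(Y)."""
    scores = [discriminant(y, m, s) for m, s in zip(means, covs)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical class statistics: class 0 is tight, class 1 is spread out.
means = [[1.0, 1.0], [-1.0, -1.0]]
covs = [[[1.0, 0.0], [0.0, 1.0]], [[4.0, 0.0], [0.0, 4.0]]]
near_class0 = classify([0.8, 1.2], means, covs)
near_class1 = classify([-3.0, -3.0], means, covs)
```

In this example the first vector is assigned to class 0 and the second to class 1.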

If the covariance matrices are identical, the above rule can be further simplified as follows: Decide ω_(j) if g _(j)(Y)>g _(k)(Y) for all k≠j  (10) where $g_{j}(Y) = 2Y\hat{\Sigma}^{-1}\hat{\mu}_{j}^{T} - \hat{\mu}_{j}\hat{\Sigma}^{-1}\hat{\mu}_{j}^{T} = 2\hat{\mu}_{j}\hat{\Sigma}^{-1}Y^{T} - \hat{\mu}_{j}\hat{\Sigma}^{-1}\hat{\mu}_{j}^{T}$.
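The simplification can be checked by expanding the quadratic term of the discriminant function in rule (9) under the assumption of a common covariance estimate for all classes. The following worked step is added for illustration and uses the same notation as the rule:

$-\ln(|\hat{\Sigma}|) - (Y - \hat{\mu}_{j})\hat{\Sigma}^{-1}(Y - \hat{\mu}_{j})^{T} = -\ln(|\hat{\Sigma}|) - Y\hat{\Sigma}^{-1}Y^{T} + 2Y\hat{\Sigma}^{-1}\hat{\mu}_{j}^{T} - \hat{\mu}_{j}\hat{\Sigma}^{-1}\hat{\mu}_{j}^{T}.$

The two cross terms combine because the inverse covariance matrix is symmetric. The first two terms on the right-hand side do not depend on j, so they cancel whenever two classes are compared, and only the last two terms, which are linear in Y, remain. For two classes whose means satisfy $\hat{\mu}_{2} = -\hat{\mu}_{1}$, the difference of the two remaining expressions reduces to $4Y\hat{\Sigma}^{-1}\hat{\mu}_{1}^{T}$, so the decision depends only on the sign of a single linear function of Y.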

In order to recursively update the parameters, the equalizer classifies an incoming sample with the current parameters and uses the result to update the statistics of the corresponding class. For example, if the incoming sample is classified as class ω_(j), SUMY_(j) and SUMYY_(j) are updated as follows:

${SUMY}_{j}(N_{j} + 1) = {SUMY}_{j}^{initial} + \sum\limits_{i = N_{j}^{initial} + 1}^{N_{j} + 1} Y_{j}(i)$ ${SUMYY}_{j}(N_{j} + 1) = {SUMYY}_{j}^{initial} + \sum\limits_{i = N_{j}^{initial} + 1}^{N_{j} + 1} Y_{j}(i) Y_{j}(i)^{T}$ where Y_(j)(N_(j)+1) is the current incoming sample, classified as class ω_(j). In practice, SUMY_(j) and SUMYY_(j) can be more efficiently updated as follows: ${SUMY}_{j}(N_{j} + 1) = {SUMY}_{j}(N_{j}) + Y_{j}(N_{j} + 1)$ ${SUMYY}_{j}(N_{j} + 1) = {SUMYY}_{j}(N_{j}) + Y_{j}(N_{j} + 1) Y_{j}(N_{j} + 1)^{T}.$

Then, the mean vector and covariance matrix are updated as follows:

$\hat{\mu}_{j}(N_{j} + 1) = \frac{1}{N_{j} + 1}{SUMY}_{j}(N_{j} + 1)$ $\hat{\Sigma}_{j}(N_{j} + 1) = \frac{1}{N_{j}}{SUMYY}_{j}(N_{j} + 1) - \frac{N_{j} + 1}{N_{j}}\hat{\mu}_{j}(N_{j} + 1)\hat{\mu}_{j}(N_{j} + 1)^{T}.$

Finally, the number of training samples of class ω_(j) is updated as follows: $N_{j} \leftarrow N_{j} + 1.$
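The efficient form of the update can be sketched in Python as follows. The function names `update_sums` and `mean_and_cov`, the use of plain lists for vectors, and the sample values are assumptions made for illustration; for brevity the sums start from zero here rather than from initial parameters.

```python
def update_sums(sum_y, sum_yy, n, y):
    """Fold one newly classified vector y into SUMY, SUMYY and the count N."""
    d = len(y)
    new_sum_y = [s + v for s, v in zip(sum_y, y)]
    new_sum_yy = [[sum_yy[r][c] + y[r] * y[c] for c in range(d)] for r in range(d)]
    return new_sum_y, new_sum_yy, n + 1


def mean_and_cov(sum_y, sum_yy, n):
    """Mean vector and covariance matrix from the running sums; needs n >= 2."""
    d = len(sum_y)
    mean = [s / n for s in sum_y]
    cov = [[sum_yy[r][c] / (n - 1) - n / (n - 1) * mean[r] * mean[c]
            for c in range(d)] for r in range(d)]
    return mean, cov


# Three hypothetical vectors, all assumed to be classified into the same class.
sum_y, sum_yy, n = [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]], 0
for y in ([1.0, 2.0], [3.0, 0.0], [2.0, 4.0]):
    sum_y, sum_yy, n = update_sums(sum_y, sum_yy, n, y)

mean, cov = mean_and_cov(sum_y, sum_yy, n)
print(mean)  # [2.0, 2.0]
print(cov)   # [[1.0, -1.0], [-1.0, 4.0]]
```

Each update touches only the stored sums, so the cost per sample does not grow with the number of samples already processed.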

Sometimes, the confidence level of classifying incoming samples is low. Such samples might not be used to update the statistics of the corresponding class. In other words, one updates the parameters using only the samples that are classified with a high level of confidence. 
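One way to realize this selective update is sketched below in Python for scalar observations. The text does not specify how confidence is measured, so the measure used here (the gap between the two largest discriminant values) and the threshold `min_margin` are assumptions, as are the names `ScalarClass` and `blind_step` and all numeric values.

```python
import math


class ScalarClass:
    """Running statistics of one class for scalar samples."""

    def __init__(self, n_init, mu_init, var_init):
        self.n = n_init
        self.sum_y = n_init * mu_init
        self.sum_yy = (n_init - 1) * var_init + n_init * mu_init ** 2

    def mean(self):
        return self.sum_y / self.n

    def var(self):
        return self.sum_yy / (self.n - 1) - self.n / (self.n - 1) * self.mean() ** 2

    def g(self, y):
        """Scalar form of the discriminant function of rule (9)."""
        return -math.log(self.var()) - (y - self.mean()) ** 2 / self.var()

    def update(self, y):
        self.sum_y += y
        self.sum_yy += y * y
        self.n += 1


def blind_step(classes, y, min_margin):
    """Classify y; update the winning class only if the decision is confident."""
    scores = [c.g(y) for c in classes]
    order = sorted(range(len(classes)), key=lambda j: scores[j], reverse=True)
    best, runner_up = order[0], order[1]
    confident = scores[best] - scores[runner_up] >= min_margin
    if confident:
        classes[best].update(y)
    return best, confident


# Hypothetical initial parameters for the two classes (states 1 and -1).
classes = [ScalarClass(10, 1.0, 0.25), ScalarClass(10, -1.0, 0.25)]
clear = blind_step(classes, 0.9, min_margin=1.0)       # far from the boundary
ambiguous = blind_step(classes, 0.05, min_margin=1.0)  # close to the boundary
```

In this example the first sample is used to update class 0, while the second is still classified but leaves the statistics of both classes unchanged.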

1. A linear equalization method for a linear dispersive channel with known channel characteristics, characterized by $y(n) = \sum\limits_{k = -L_{1}}^{L_{2}} a_{k} x(n - k) + e(n),$ comprising the steps of: (a) constructing a vector from received signals as follows: $Y(n) = [y(n - L_{1}), y(n - L_{1} + 1), \ldots, y(n), \ldots, y(n + L_{2} - 1), y(n + L_{2})]^{T}$; (b) computing a weight vector as follows: $w = \Sigma_{Y}^{-1}(\mu_{Y_{1}} - \mu_{Y_{2}})$ where $\mu_{X_{1}} = [\underbrace{0, \ldots, 0}_{L_{1}\ \text{zeros}}, 1, \underbrace{0, \ldots, 0}_{L_{2}\ \text{zeros}}]^{T}$, $\mu_{X_{2}} = [\underbrace{0, \ldots, 0}_{L_{1}\ \text{zeros}}, -1, \underbrace{0, \ldots, 0}_{L_{2}\ \text{zeros}}]^{T}$, $\Sigma_{X_{1}} = I - \mu_{X_{1}}\mu_{X_{1}}^{T}$, $\mu_{Y_{1}} = A\mu_{X_{1}}$, $\mu_{Y_{2}} = A\mu_{X_{2}}$, $\Sigma_{Y} = A\Sigma_{X_{1}}A^{T} + \sigma_{e}^{2}I$, $A = \begin{bmatrix} a_{-L_{1}} & a_{-L_{1} + 1} & \ldots & a_{L_{2} - 1} & a_{L_{2}} & \ldots & 0 & 0 & 0 \\ 0 & a_{-L_{1}} & a_{-L_{1} + 1} & \ldots & a_{L_{2} - 1} & a_{L_{2}} & \ldots & 0 & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & \ldots & a_{-L_{1}} & a_{-L_{1} + 1} & \ldots & a_{L_{2} - 1} & a_{L_{2}} & 0 \\ 0 & 0 & 0 & \ldots & a_{-L_{1}} & a_{-L_{1} + 1} & \ldots & a_{L_{2} - 1} & a_{L_{2}} \end{bmatrix}$, I is an identity matrix, and $\sigma_{e}^{2}$ is the noise power; and (c) determining a received signal according to the following rule: if $w^{T}Y(n) > 0$, then decide that state 1 is transmitted; if $w^{T}Y(n) < 0$, then decide that state −1 is transmitted.
2. A linear equalization method with unknown channel characteristics, comprising the steps of: (a) constructing a class 1 vector from training signals when an input signal corresponding to y(n) is state 1, as follows: $Y_{1}(n) = [y(n - L_{1}), y(n - L_{1} + 1), \ldots, y(n), \ldots, y(n + L_{2} - 1), y(n + L_{2})]^{T}$; (b) constructing a class 2 vector from training signals when an input signal corresponding to y(n) is state −1, as follows: $Y_{2}(n) = [y(n - L_{1}), y(n - L_{1} + 1), \ldots, y(n), \ldots, y(n + L_{2} - 1), y(n + L_{2})]^{T}$; (c) computing a weight vector as follows: $w = \hat{\Sigma}_{Y}^{-1}(\hat{\mu}_{Y_{1}} - \hat{\mu}_{Y_{2}})$ where $\hat{\mu}_{Y_{1}} = \frac{1}{N_{1}}\sum\limits_{i = 1}^{N_{1}} Y_{1}(i)$ (N₁: the number of training vectors belonging to state 1), $\hat{\Sigma}_{Y_{1}} = \frac{1}{N_{1}}\sum\limits_{i = 1}^{N_{1}} (Y_{1}(i) - \hat{\mu}_{Y_{1}})(Y_{1}(i) - \hat{\mu}_{Y_{1}})^{T}$, $\hat{\mu}_{Y_{2}} = \frac{1}{N_{2}}\sum\limits_{i = 1}^{N_{2}} Y_{2}(i)$ (N₂: the number of training vectors belonging to state −1), $\hat{\Sigma}_{Y_{2}} = \frac{1}{N_{2}}\sum\limits_{i = 1}^{N_{2}} (Y_{2}(i) - \hat{\mu}_{Y_{2}})(Y_{2}(i) - \hat{\mu}_{Y_{2}})^{T}$, and $\hat{\Sigma}_{Y} = (\hat{\Sigma}_{Y_{1}} + \hat{\Sigma}_{Y_{2}})/2$; and (d) determining a received signal according to the following rule: if $w^{T}Y(n) > 0$, then decide that state 1 is transmitted; if $w^{T}Y(n) < 0$, then decide that state −1 is transmitted.
3. A blind linear equalization method, comprising the steps of: (a) assigning initial parameters {N_(j) ^(initial),μ_(j) ^(initial),Σ_(j) ^(initial)} where N_(j) ^(initial) is the initially assumed number of training samples of class ω_(j), μ_(j) ^(initial) is the initial estimator of the mean vector of class ω_(j), and Σ_(j) ^(initial) is the initial estimator of the covariance matrix of class ω_(j); (b) estimating the mean vector and covariance matrix of class ω_(j) as follows: $\hat{\mu}_{j}(N_{j}) = \frac{1}{N_{j}}{SUMY}_{j}(N_{j})$, $\hat{\Sigma}_{j}(N_{j}) = \frac{1}{N_{j} - 1}{SUMYY}_{j}(N_{j}) - \frac{N_{j}}{N_{j} - 1}\hat{\mu}_{j}(N_{j})\hat{\mu}_{j}(N_{j})^{T}$, where N_(j)=N_(j) ^(initial)+N_(j) ^(actual), N_(j) ^(actual) is the actual number of samples that are classified as class ω_(j), ${SUMY}_{j}(N_{j}) = {SUMY}_{j}^{initial} + \sum\limits_{i = N_{j}^{initial} + 1}^{N_{j}} Y_{j}(i)$, and ${SUMYY}_{j}(N_{j}) = {SUMYY}_{j}^{initial} + \sum\limits_{i = N_{j}^{initial} + 1}^{N_{j}} Y_{j}(i) Y_{j}(i)^{T}$; (c) determining a received signal according to the following rule: if g _(j)(Y(n))>g _(k)(Y(n)) for all k≠j, then decide ω_(j) where $g_{j}(Y(n)) = -\ln(|\hat{\Sigma}_{j}|) - (Y(n) - \hat{\mu}_{j})\hat{\Sigma}_{j}^{-1}(Y(n) - \hat{\mu}_{j})^{T}$; and (d) updating the parameters as follows: ${SUMY}_{j}(N_{j} + 1) = {SUMY}_{j}(N_{j}) + Y_{j}(N_{j} + 1)$, ${SUMYY}_{j}(N_{j} + 1) = {SUMYY}_{j}(N_{j}) + Y_{j}(N_{j} + 1) Y_{j}(N_{j} + 1)^{T}$, $N_{j} \leftarrow N_{j} + 1$, where Y_(j)(N_(j)+1) is Y(n), the current incoming vector, which is classified as class ω_(j).
4. A linear equalization method that views equalization as a multiclass classification problem by considering a plurality of bits simultaneously, comprising the steps of: (a) constructing a received vector from received training signals as follows: $Y(n) = [y(n - L_{1}), y(n - L_{1} + 1), \ldots, y(n), \ldots, y(n + L_{2} - 1), y(n + L_{2})]^{T}$; (b) assigning said received vector to one of the classes, each of which has a unique bit pattern of said plurality of bits, and thereby generating a set of training vectors for each class; (c) estimating a mean vector from said set of vectors for each class as follows: $\hat{\mu}_{Y_{j}} = \frac{1}{N_{j}}\sum\limits_{i = 1}^{N_{j}} Y_{j}(i)$ where $\hat{\mu}_{Y_{j}}$ is an estimated mean vector of class ω_(j), Y_(j)(i) is the i-th training vector of class ω_(j), and N_(j) is the number of vectors corresponding to class ω_(j); (d) estimating a covariance matrix from said set of vectors for each class as follows: $\hat{\Sigma}_{Y_{j}} = \frac{1}{N_{j}}\sum\limits_{i = 1}^{N_{j}} (Y_{j}(i) - \hat{\mu}_{Y_{j}})(Y_{j}(i) - \hat{\mu}_{Y_{j}})^{T}$ where $\hat{\Sigma}_{Y_{j}}$ is an estimated covariance matrix of class ω_(j); and (e) determining a received signal according to the following rule: if g _(j)(Y(n))>g _(k)(Y(n)) for all k≠j, then decide ω_(j) where $g_{j}(Y(n)) = -\ln(|\hat{\Sigma}_{Y_{j}}|) - (Y(n) - \hat{\mu}_{Y_{j}})\hat{\Sigma}_{Y_{j}}^{-1}(Y(n) - \hat{\mu}_{Y_{j}})^{T}$, $\hat{\mu}_{Y_{j}}$ is said estimated mean vector of class ω_(j), and $\hat{\Sigma}_{Y_{j}}$ is said estimated covariance matrix of class ω_(j).